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Moment approximation methods are gaining increasing attention for their use in the approximation of the 
stochastic kinetics of chemical reaction systems. In this paper we derive a general moment expansion method 

for any type of propensities and which allows expansion up to any number of moments. For some chemical 
reaction systems, more than two moments are necessary to describe the dynamic properties of the system, 
which the linear noise approximation (LNA) is unable to provide. Moreover, also for systems for which 
the mean docs not have a strong dependence on higher order moments, moment approximation methods 
give information about higher order moments of the underlying probability distribution. We demonstrate the 
method using a dimerisation reaction, Michaelis-Menten kinetics and a model of an oscillating p53 system. We 
show that for the dimerisation reaction and Michaelis-Menten enzyme kinetics system higher order moments 
have limited influence on the estimation of the mean, while for the p53 system, the solution for the mean can 
require several moments to converge to the average obtained from many stochastic simulations. We also find 
that agreement between lower order moments does not guarantee that higher moments will agree. Compared 
to stochastic simulations our approach is numerically highly efficient at capturing the behaviour of stochastic 
systems in terms of the average and higher moments, and we provide expressions for the computational 
cost for different system sizes and orders of approximation. We show how the moment expansion method 
can be employed to efficiently quantify parameter sensitivity. Finally we investigate the effects of using too 
few moments on parameter estimation, and provide guidance on how to estimate if the distribution can be 
accurately approximated using only a few moments. 

PACS numbers: 82.39.-k, 87.10.Mn, 82.20.Fd, 87.16.A- 



I. INTRODUCTION 

Cellular behaviour is shaped by molecular process that 
can be described by systems of chemical reactions be- 
tween different molecular species. At the macroscopic 
scale, the dynamics of these processes are frequently de- 
scribed in terms of mean concentrations of species us- 
ing deterministic mass action kinetics (MAK). The de- 
terministic solution, however, does not always capture 
the essential kinetics of the chemical system accurately, 
because it excludes stochastic effects. 

The stochastic kinetics of chemical reaction systems 
are captured by the chemical master equation (CME), 
a probabilistic description of the system (also known as 
Kolmogorov's forward equation) The CME is a set 
of differential or difference equations that capture how 
the probability over the states (e.g. the abundances of 
the different molecular species) evolves over time. The 
CME offers an exact description of a system's dynam- 
ics but can only be solved analytically for very simple 
systems. Exact single realisations of the CME, corre- 
sponding e.g. to observing processes inside a single cell, 
can be obtained numerically using for example Gillespie's 
Stochastic Simulation Algorithm (SSA)^. The SSA is a 
discrete simulation method that proceeds by randomly 
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selecting a reaction that occurs at each subsequent time 
step, according to the probability of that reaction occur- 
ring next. This method is associated with considerable 
computational cost, that increases dramatically with the 
size of the model, which can make it infeasible to compre- 
hensively characterize large systems through simulations. 

A number of numerical methods have been developed 
to approximate the solution of the CME for more com- 
plex systems, including methods that approximate the 
CME by describing the probability distribution in terms 
of its moments^^'^"'^. When only the mean (the first mo- 
ment) is taken into account, the moment expansion re- 
duces to the MAK description. In the linear noise ap- 
proximation (LNA), the CME is approximated by tak- 
ing into account the mean (the first moment), and the 
variance and covariance (second central moments) of the 
distribution, whereby the second central moments are de- 
coupled form the mean^^~^^. This is valid in the limit of 
large volumes and molecule numbers, or when the system 
consists of first-order reactions only. 

For smaller systems and more complex reactions, a 
number of different approaches have been developed that 
aim to capture the temporal changes in coupled mo- 
ments. These expansions can be performed in terms of 
the moments or the central moments about the mean; 
the conversion from molecule numbers can be kept im- 
plicit in the rate parameters or made more explicit from 
the outset. Previous developments focussed on expan- 
sion methods for polynomials*'"^^'^^, or expansions up to 
a limited order. Gillespie et al. developed an expansion 
method in terms of moments for polynomial functions^; 
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this was later generalized to include rational functions 
as well^^. Other alternative methods make the depen- 
dence on system size more obvious and include so-called 
Mass Fluctuation Kinetics (MFK)^, and the 2MA and 
3MA approaches^®. In the Mass Fluctuation Kinetics 
approach the expansion is done in terms of central mo- 
ments up to third order for first and second order rate 
equations, and up to second order for general rate equa- 
tions. In a recent paper^*^ it was shown that expansion up 
to three moments tends to deliver more accurate results 
than expansion up to only second order; in particular 
the variances are improved by going to higher moments. 
While the MFK, 2MA and 3MA approaches include ex- 
pressions for the first, second and third central moments, 
no general formulation exists in these frameworks to gen- 
erate higher order central moments in an automatic way. 

In this paper we derive a general method for expanding 
the CME in terms of its central moments about the mean, 
which does not make extraneous assumptions about the 
form of the reactions, and that can be evaluated up to any 
number of moments; in our exposition we follow the nota- 
tion used in Gillespie ^ . The new method described here 
non-trivially generalizes the work of Gomez et al.^: in 
particular we are able to generate arbitrary order higher 
central moments automatically and in a computationally 
efficient manner; combining computer algebra systems 
with numerical simulation engines does allow us to tackle 
problems that stymy e.g. the linear noise approximation 
or conventional (3rd order) MFK approaches. The in- 
terplay between noise and non-linear dynamics can give 
rise to very complicated behaviour, and only a handful of 
systems have been considered. The moments of the dis- 
tribution described by the CME reflect this intricate re- 
lationship between stochasticty and non-linearity and we 
spend some time discussing this for a typical non-linear 
biomolecular system. Thus while going to arbitrary mo- 
ments is straightforward in our framework, we focus our 
discussion on the lower (up to sixth order) moments. 

Our expansion is based on two successive Taylor ex- 
pansions that allow us to express the CME of the mo- 
ment generating function in computationally convenient 
form; and we truncate the system by setting higher or- 
der terms in the Taylor expansions to zero. The outlined 
method coiild fit into a general framework for parame- 
ter inference based on maximum entropy distributions 
derived from the calculated moments. We furthermore 
demonstrate that parameter sensitivity analyses may be 
performed naturally and efficiently in this framework: we 
can consider the rate of change of the moments with re- 
spect to the parameters, which also allows us to study 
the factors underlying cell-to-cell variability. We illus- 
trate the general method using three molecular reaction 
systems: a simple dimerisation reaction, which allows for 
a detailed investigation; Michaelis-Menten enzyme kinet- 
ics; and the p53 system, an oscillating tumour suppressor 
system^ 



II. MOMENT EXPANSION METHOD 

We consider a system with A'' different molecular 
species [Xi, ...X^) that arc involved in L chemical re- 
actions with reaction rates fc;. 
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s^Xi + ... + s^. 

with Sj and Si the number of molecules of type Xi before 
and after the reaction, respectively. The time evolution 
of the system's state is described by the chemical master 
equation (CME), 

'^^W = Vp(x-sOai(x-s,)-P(x)ai(x), (2) 



dt 
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in which ai{x) are the propensity functions, with ai{x)dt 
defined as the probability of reaction I occurring in an in- 
finitesimal time interval dt when the number of molecules 
in the system is x , P(x) the probability that the system 
contains x molecules and s; = — S;. 

We start the derivation of the moment expansion 
method by deriving a moment generating function from 
the CME. In general, the moment generating function of 
a random variable x is defined as^ 

m(6i,x) = ^e^^P(x). 



(3) 



The moments, (x"), with x" = a;"\..a;^''', of the proba- 
bility distribution can be found by taking the n-th order 
derivatives of the moment generating function with re- 
spect to 9. The first moment is, of course, equal to the 
mean, fi = (x). The variance, skewncss (related to asym- 
metry of the distribution) and kurtosis (related to the 
chance of outliers) can be derived from the central mo- 
ments about the mean, ((x — /i)"). 

Using the CME we can write the time dependent mo- 
ment generating function^^ 



dm 

~dt 



E 



(4) 



The time evolution of the mean concentration fii of 
species Xj can be obtained by taking the first deriva- 
tive of Eq. 4 with respect to 6i and subsequently setting 
6 to zero, 
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This expression can be evaluated by expanding a;(x) in 
a Taylor expansion about the mean. 
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where S is the stoichiometry matrix, 

gn _ gni + ..+nd 

n! = ni!...nd! 
dx^ = dx1\..dxy, 
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,(6) 
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and we have substituted the central moments around the 
mean for the expected values of the expansion terms 



((xi-Mir-..(x„-M„D. 



The first central moment (x — /x) is zero. Higher order 
central moments can be derived from the moments, for 
example the covariance between xi and X2 can be written 

as 



'^1^X2 = ((a^i - ^^l){x2 - M2)) = 

{X1X2) + - {Xi) (/i2) - (.T2) {ill) . 



(7) 



In general the relation between the central moments, 
Mx", and the moments, /x", can be formulated as 




where 



(ni-fci) 



,Xnd-kd) 



central moments; the time evolution of the central mo- 
ments remains a function of the central moments alone, 
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When the model under investigation is non-linear, each 
central moment will depend on a higher order central 
moment, which may itself also depend on higher order 
moments; hence the number of equations we would need 
to include is in principle infinite. To overcome this, we 
can obtain a closed set of equations by evaluating the 
time evolution equations for u moments and truncating 
the Taylor series after the vth order, thereby setting all 
higher order central moments equal to zero. By trun- 
cating the Taylor expansion (i.e. setting terms of the 
Taylor expansion corresponding to ^n, > 1/ to 0), the 
equations are only dependent on the central moments up 
to the selected order 1^. Alternatively, the set of equa- 
tions could be closed using moment closure techniques 
based on common expressions for well known probability 
distributions^*. In this paper we will use truncation as 
well as closure based on a Gaussian distribution. 



We obtain the time evolution equations of the central 
moments by taking the time derivative of Eq. 8, 
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The term a makes the time evolution of the central mo- 
ments a function of Eq. 6, 
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The term j3 gives rise to mixed moments, and the deriva- 
tive of P with respect to time yields the time evolution 
equations for the mixed moments. Therefore we also need 
to include the time derivatives of the mixed moments in 
our system of equations. We obtain the time derivative 
of the term (x*') by taking higher order derivatives of the 
moment generating function Eq. 4, resulting in 
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x^^-^^a{x)), (11) 



where 



e ei 
S = Si ...s 



x^^-^^ = x'^'-^\..x 
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By expanding the individual terms of the resulting ex- 
pressions in a second Taylor expansion, the time evolu- 
tion of the mixed moments becomes a function of the 



III. RESULTS 

We illustrate the approach in a range of applications 
that serve to highlight both the basic properties of the 
moment expansion method as well as the advantages this 
method offers in situations where other methods^^'^^'^^ 
tend to fail. 

A. DImerlsation 

We first illustrate the moment approximation method 
using a simple dimerisation reaction^^. The system de- 
scribes the reversible formation and disintegration of a 
dimeric molecule, 



fci. 



(13) 



The system can be written in terms of two propensities 
ai : kixi{xi - 1); 02 : A:2a;2, (14) 
and the stoichiometry matrix 



S = 



-2 2 
1 -1 



(15) 



where the columns correspond to reactions and the rows 
to variables. 

The exact kinetics of the system can be straightfor- 
wardly simulated using the Stochastic Simulation Algo- 
rithm (SSA)^. One realisation of the SSA is equivalent 
to observing the kinetics of the system inside a single 
cell (Figure la), whereas the average over many realisa- 
tions mimics the observation of the average kinetics for a 
large number of cells (Figure lb, n=100,000). The system 
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FIG. 1. Study of dimerization system, initial values x = 
[301,0], parameters k = [1.66- 10"^0.2]. a) Single SSA 
realisation, b) Average of 100,000 SSA simulations, c-d) 
Histogram of SSA runs (grey bars) and probability density 
of normal distribution (blue line) calculated from mean and 
variance of SSA runs corresponding to points c and d in fig- 
ure (b). e) Mean for both variables, calculated using SSA, the 
moment approximation using 1 moment (deterministic) and 
two central moments (2m). f) Variance of x\ calculated using 
SSA, two central moments (2m), three central moments with 
Gaussian closure (3m) and four central moments (4m). g) 
Third central moment calculated using SSA and moment ap- 
proximation method, fourth central moment calculated using 
SSA and moment approximation method. 



reaches a stationary state after about 4 seconds. In Fig. 
Ic-d we plotted histograms of the SSA at two time points 
that correspond to different regimes in the trajectory: 
transient state with decreasing molecule number (Fig. 
Ic) and stationary state (Fig. Id). We calculated the 
normal probability density function from the mean and 
variance of the SSA distribution at those two time points 
(blue line), ignoring higher order moments. The normal 
probability density functions fit the histograms very well, 
which indicates that the distribution of molecule num- 
bers is approximately normal over the time course, and 



TABLE I. Error between mean, second and third central mo- 
ment calculated with SSA and approximation methods for the 
dimerization system. 
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higher order moments would have limited influence on 
describing the kinetics of the mean for this system. 

Figure Id shows the mean molecule numbers calculated 
with the moment approximation method compared to the 
SSA results. The results for the deterministic (including 
only the mean) as well as the two moment approximation 
are approximately equal to the means calculated with the 
SSA. We compare the higher order central moments cal- 
culated with the general moment approximation method 
described above with the results from the SSA simula- 
tions in Figures If-h. In the 3-moment approximation we 
closed the equations using the Gaussian probability dis- 
tribution (setting the fourth cumulant equal to zero^°), 
while in the other approximations we truncated higher 
order moments. The approximated moments are close to 
the exact moments calculated from the SSA, which is also 
clear from the er rors displayed in Table I, c alculated as 

e = (I/iV) En ^mSSA-Mma)/MsSAY with MgSA 

the moment or central moment calculated based on the 
SSA, Mma the corresponding value calculated with the 
moment approximation, and A'' the number of time points 
taken into account. The larger error for the mean when 
using the deterministic approach is due to small differ- 
ences in the first part of the trajectory, the decreasing 
part, where the contributions of the higher order central 
moments are largest. The larger errors calculated for the 
third central moment are due to the fluctuations that are 
still present in the third central moment trajectory calcu- 
lated based on 100,000 SSA runs. Increasing the number 
of SSA runs would reduce this effect. 



B. Michaelis-Menten enzyme kinetics 

We next look at Michaelis-Menten enzyme kinetics, 
where an enzyme, and substrate, 5", first bind to form 
a complex, SE] following this, the complex can dissoci- 
ate into the original components S and -E, or S can be 
converted into the product, P, 
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FIG. 2. Study of Michaelis-Menten kinetics, with parame- 
ters k = [1.66 ■ 10~^,1 • 10~*,0.01], and initial conditions, 
3(0) = 301, P(0) = 0, and E{0) = 120. (a) Single SSA reali- 
sation. Trajectories calculated using (b) moment approxima- 
tion including only the mean (deterministic), (c-d) Variance 
of S and covariance between S and P calculated using SSA 
and approximation using 2 moments, (e) Skewness of S cal- 
culated using SSA and approximation with 3 moments, (f) 
Kurtosis calculated using SSA and approximation up to 4 
and 6 moments. 



The system is often reduced to a system of two variables 
{S and P)^^, with three reaction propensities 

ai : kiS[E{0) - S{Q) + S + P]; 
a2 : k2[S{0) - (S + P)]; 
as : k^iSiO) - {S + P)] 



and the stoichiometry matrix 



S = 



-1 1 
1 



(17) 



One trajectory calculated with Gillespie's Stochastic 
Simulation Algorithm is shown in Figure 2a, and the 
mean calculated by solving the ODE system using the 
deterministic representation of the system is displayed 
in Figure 2b. For this system the deterministic repre- 
sentation is very close to the stochastic solution. The 
variance of the substrate and the covariance between the 
substrate and the product (Figure 2c-d) calculated based 
on 100,000 SSA runs can be closely approximated using 
the moment approximation method expanded up to two 



moments. Figure 2e shows the skewness of the distribu- 
tion over time, calculated as 



7 



rr3 



(18) 



For a normal distribution the skewness is zero. The skew- 
ness is approximated well using the moment approxima- 
tion method up to three moments. The kurtosis, given 

by 
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indicates the thickness of the tails of the probability dis- 
tribution, relating to the probability of outliers. For a 
normal distribution, the kurtosis is equal to 3. When 
four moments are used to approximate the system, the 
approximation of the kurtosis that we obtain from the 
SSA is not as close as when also higher moments, here 
six moments, are included in the calculation. This il- 
lustrates that agreement between lower-order moments 
does not guarantee that higher-order moments will also 
agree. This problem is likely exacerbated for more com- 
plex models, e.g. those exhibiting non-linear dynamics. 



C. P53 system 

Finally, we investigate the oscillatory p53 system^^, 
which consists of three proteins connected through a non- 
linear feedback loop: p53, precursor of Mdm2 and Mdm2. 
The system can be written in terms of six propensities, 



ai : ki; 
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and the stoichiometry matrix 



02 : k2X 
a4 : k4X 
ag : key, 



(20) 
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where x is the concentration of p53, yo the concentration 
of precursor of Mdm2, y is the concentration of Mdm2, ki 
is the p53 production rate, k2 is the Mdm2-indcpendcnt 
p53 degradation rate, ^3 the saturating p53 degradation 
rate, k^ is the p53 threshold for degradation by Mdm2, 
^4 is the p53-dependent Mdm2 production rate, fcs is the 
Mdm2 maturation rate and kg is the Mdm2 degradation 
rate. 

Figure 3a shows an SSA simulation of the model for pa- 
rameters qi =[90,0.002,1.7,1.1,0.93,0.96,0.01] and initial 
values x(0) ~ [70,30,60], which exhibits oscillating be- 
haviour. Because the oscillations for different realisations 
of the SSA (corresponding to different cells) are stochas- 
tically out of phase, the average over 100,000 stochas- 
tic simulations shows a damped oscillation, reflecting the 
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FIG. 3. Study of p53 model, parameter set q\ = 
[90,0.002, 1.7, 1.1,0.93,0.96,0.01]. (a) Single SSA realisation, 
(b) Average of 100.000 SSA simulations. Trajectories cal- 
culated using (c) moment approximation including only the 
mean (deterministic), (d) Linear Noise Approximation, (e) 
mean and the variance, (f) up to three central moments, (g) 
five central moments and (h) six central moments, (i-j) Cu- 
mulative difference between mean trajectory calculated with 
SSA and trajectories calculated using (i) 2 moments and (j) 
6 moments. 

presence of a negative feedback loop. Figures 3c-h show 
the trajectories of the mean calculated with the moment 
approximation method. In the deterministic approxi- 
mation, the oscillations are not damped but expanding. 



which would indicate a positive instead of negative feed- 
back loop. The LNA and the 2 moment approximation 
show the same effect. The mean calculated with the LNA 
is always equal to the mean calculated with the determin- 
istic approximation because the mean is not coupled to 
the variance. When 3 moments arc included, the sys- 
tem shows damped oscillations, although not as damped 
as the SSA trajectories. By including more moments 
the trajectories converge to the SSA trajectories. When 
6 moments are taken into account, the trajectories cal- 
culated with the moment approximation show a similar 
behaviour to the trajectories calculated with the SSA. 
This is confirmed by the cumulative difference between 
the SSA trajectories and the trajectories calculated with 
the moment approximation shown in Figures 3i-j, which 
show a clear decrease in cumulative error for all variables 
when 6 central moments compared to 2 central moments 
are included in the approximation. Including more mo- 
ments would improve the estimation further. 

We analyze the distribution of the p53 model over the 
time course by looking at the central moments (Figure 
4). The variance for the SSA is first increasing, then after 
about 20 hours it levels off. In Figure 4b we compare the 
variance calculated based on the SSA with the LNA and 
moment approximation method. When up to five cen- 
tral moments are included, the variance keeps increasing 
and does not level off. Only for the case of 6 moments 
does the variance reach a stable value, and even then the 
value is about three times higher than that predicted by 
the SSA simulations. Figure 4c shows the comparison 
of the skewness of the SSA distribution to the skewness 
of a normal distribution (7 = 0). For three time points 
where the skewness is relatively large (indicated by d, e, 
f) we display the histogram of the 100,000 SSA realisa- 
tions together with the probability density of the normal 
distribution (cyan line) calculated based on the mean and 
variance of the SSA and the 6 moment approximation. 
Additionally, we plot the skew-normal distribution calcu- 
lated using the mean, variance and third central moment; 
this is defined by the probability density function 



y} \ a 



(22) 



with ^ a location parameter and lo a scale parameter. 
The parameter 5 can be calculated from the estimated 
skewness using the relation 



1^1 = 




(23) 



The parameter a can be calculated from 5 with 5 = 
a/ {\/\ — (5^), and w can be calculated from the variance 
using (T^ — — 25^ /tt). These plots confirm what we 
saw above, that using only the mean and variance does 
not capture the full distribution in this case, and also 
including the skewness is not enough. When comparing 
the skewness of the p53 distribution with the skewness 
of the Michaelis-Menten enzyme kinetics system (Figure 
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FIG. 4. Analysis of distribution for p53 model, (a) Variance 
calculated based on SSA runs, (b) Variance calculated with 
SSA, LNA and moment approximation method, (c) Skew- 
ness calculated based on SSA runs (blue line) and skewness 
for normal distribution (cyan dashed line), (d-f) Histograms 
calculated based on SSA for points d, e and f in figure (c), and 
probability densitiy function of normal distribution calculated 
using mean and variance based on SSA (cyan line). 



2e) , we find that the maximum value of the skewness for 
both systems is approximately equal, and in both sys- 
tems the skewness does not have a large effect on the 
mean. 

We evaluate the influence of ignoring higher order mo- 
ments on parameter estimation with this system by look- 
ing at 2-dimensional contour diagrams of the distance d 
(which is closely related to the Gaussian likelihood^^) 
calculated via 



(24) 



where t are the time points of the trajectories and T the 
total number of time points included in the calculation 
(T = 800), Xi are the mean values of the SSA trajecto- 
ries for the three variables in the system at timepoints i, 
and /ii are the mean values of the trajectories calculated 
with the moment expansion method. Figures 5a-b show 
contour plots of the distances between the SSA trajec- 
tories and moment expansion trajectories calculated for 
varying values of parameters fci and in the p53 sys- 
tem (Eq. 20), Figures 5c-d show contour plots of the 
distances calculated for varying values of parameters fcs 
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FIG. 5. Contour plots for the p53 system of the distance 
d between SSA trajectories and trajectories calculated with 
the moment expansion method, a-b) Contour for varying k\ 
and fca, calculated using expansion up to 2 and 6 moments 
respectively, c-d) Contour for varying fcs and fee, calculated 
using expansion up to 2 and 6 moments respectively. Red 
dots indicate parameter values used for SSA simulations. 



and /cg. The parameters used for the SSA simulations 
{ki = 90, fca = 1.7, fcs = 0.93, fee = 0.96) are indicated 
with a red dot. From the contour plots corresponding to 
expansion up to two central moments (Figures 5a, c) we 
can see that the minimum distance is obtained for param- 
eters values that are not equal to the actual parameters 
used for the SSA simulations. This indicates that when 
expansion up to two moments is used for parameter in- 
ference of the p53 system, the wrong parameters will be 
found. However, when six central moments are included 
(Figures 5b, d), the minimum distance does correspond 
to the parameters used for the SSA calculations. 



D. Parameter Sensitivity Estimation 

Assessing parameter sensitivity is a key concern when 
fitting any parametric model^^'^^. Such analyses allow us 
to quantify how rapidly the outputs of our model change 
as we vary its parameters, which can provide insights into 
the robustness of the model and the relative influence 
that each parameter has upon the model's behaviour. 
However, sensitivity analyses of stochastic models can 
be difficult and/or computationally costly'^" and often 
involve simulating many times in order to obtain Monte 
Carlo estimates of sensitivity coefficients. The develop- 
ment of efficient methods for stochastic sensitivity anal- 
yses has been the focus of much recent research 

In the context of our proposed approach, a natural 
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and straightforward way to assess parameter sensitivity 
is to consider the rate at which the moments vary with 
the parameters. This motivates the calculation of simple 
sensitivity coefficients^*'^'* of the form Sij{t) ~ 9m^(t,e) ^ 
where TOj is the (estimated) i-th moment and 9j is the j- 
th parameter. Within our moment approximation frame- 
work, the .Sjj's may cither be estimated by perturbing 
the model's parameters and computing a finite difference 
approximation, or obtained automatically by employing 
the CVODES solver of Serban and Hindmarsh^^ when 
solving the system of ODEs (Equations (6) and (9)). In 
Figure 6, we reconsider the dimerisation model of Section 
III A. We focus upon the sensitivity of the mean and 2nd 
and 3rd central moments of the two molecular species 
to the parameter ki (similar results are obtaincxl for the 
sensitivity to ^2)- Figures 6a, d and g show how the mo- 
ments estimated from 100,000 SSA runs vary when we 
increase the original value of ki by 10 percent. Figures 
6b, e and h show the same for the moments estimated us- 
ing our proposed approach with 6 central moments (6m). 
Figures 6c, f and i show sensitivity coefficients estimated 
from both the SSA and 6m outputs using a finite differ- 
ence approximation (in the 6m case, the sensitivity coef- 
ficients may instead be obtained automatically using the 
CVODES solver, which yields identical results). There 
is generally good agreement between the coefficrients es- 
timated using the two different approaches. However, as 
we consider higher moments, our ability to assess sen- 
sitivity using the SSA output rapidly diminishes, since 
the variability caused by the change in the parameter 
value is overwhelmed by the variability in the estima- 
tor due to finite sample size. This may be rectified by 
increasing the number of SSA simulations, but at con- 
siderable computational cost. In contrast, the sensitivity 
coefficients associated with higher moments may still be 
straightforwardly calculated using the moment expansion 
approach (although, of course, care must be taken to en- 
sure that appropriately many moments have been taken 
into account by the approximation — see Section III E). 

E. Simple Heuristics for Moment Expansions 

Our results for the p53 system clearly demonstrate that 
failure to take a sufficient number of moments into ac- 
count can lead to incorrect conchisions and biased pa- 
rameter estimates. Ideally we would like to know from 
the outset whether a deterministic approach or a two 
moment approximation is sufficient to capture the gen- 
eral statistical behaviour of the stochastic system. But 
without recourse to a large number of SSA runs it is im- 
possible to predict the statistical properties of the solu- 
tions to non-linear stochastic systems. And in such cases 
it is generally not feasible to perform large numbers of 
SSA simulations and we need a different approach. We 
should look at the assumption made at the beginning of 
our derivation, where we assumed that we can approxi- 
mate the propensity functions with a Taylor expansion. 



For a single variable a Taylor expansion of a function 
f{x) about the point c has the general form 

m = m + ^^{x-c) + ^^{x-cf+... (25) 

By taking into account only the first term, we assume 
that the function value in point x is the same as for point 
c; by taking into account also the second term we assiime 
that f{x) can be approximated by a straight line between 
points X and c, etc.. Truncating the expansion at a low 
order will only result in a good approximation in case x 
is close to point c, where we have approximated the true 
function. In our case c is the mean, fi, implying that 
an approximation using a few moments will be accurate 
only in case all observations are close to the mean. In 
case it is possible to perform a single realisation of the 
SSA, we can assess this quality by comparing the mean 
calculated with the deterministic approach with the tra- 
jectory calculated with the SSA (Figure 7). 

Figure 7a displays the deterministic mean of xi and 
one SSA simulation for the dimerisation system. In this 
case the trajectory is close to the mean over the complete 
time course. A single trajectory for p53, displayed in Fig- 
ure 7c, compared to the deterministic result shows that in 
this case the distance of the trajectory from the mean is 
much larger. In case a single SSA realisation of the model 
is not possible, but experimental data arc available, the 
distance form the mean can still be investigated in the 
same way. Figures 7b, d show the mean calculated us- 
ing the deterministic approach together with 'measured' 
vahies at three time points (obtained with the SSA), re- 
peated three times, resulting in nine data points. Also 
when looking at the distance form these nine points from 
the deterministic mean, it is clear that for the dimerisa- 
tion system {x — ji) is small, whereas for the p53 system 
it is relatively large, indicating that a larger number of 
moments is necessary to capture the full distribution. 

Such simple heuristics have the advantage of being 
computationally affordable. While inadequate at guar- 
anteeing good performance of an expansion using any 
finite number of moments, we can use them to capture 
any gross inadequacy of a given approximation relatively 
reliably. Such small-scale analyses should precede or ac- 
company moment expansions. More generally, we can 
consider this problem from the point of view of statisti- 
cal model checking; see e.g.^^. But the question as to 
how many stochastic simulations need to be averaged 
over to get a good idea of the mean (or any higher mo- 
ment) is challenging to answer for all but the most trivial 
systems"^''. 



F. Computational Complexity 

The computational complexity of the moment approx- 
imation method depends on the number of variables in 
the system, and the number of terms that need to be 
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evaluated for each central moment. Because of the sym- 
metry of central moments, e.g. {{xi — ^j,i){x2 — H2)) = 
{ix2 — fJ-2){xi — (Ui)), there are many terms we do not 
need to include in the ODE representation of the sys- 
tem. The total number of central moment terms that 
could be nonzero when approximating a system with d 
variables and up to N moments is given by 



(N + d) 

Ncrn = -d-l 

Nidi 



(26) 



We subtract d terms because the first order central mo- 
ments are always zero, and one term corresponding to 
the zeroth order central moment. For the deterministic 
case, the total number of ODE equations necessary to 
describe the system is equal to the number of variables, 
each term describes the mean of one of the variables. For 
the LNA the total number of equations is equal to the 



number of equations needed for the two moment approx- 
imation. We displayed the number of central moment 
terms to be evaluated for systems up to 9 variables and 
up to 6 central moments in Table II. When two central 
moments are included for 2 variables, we need to evaluate 
two ODE equations corresponding to the mean, and three 
equations for the second central moments, i.e. the vari- 
ance of both variables, and their covariance. From Table 
II we can see that the number of equations that need to 
be included rises quickly with the number of variables in 
the system. 

However, in order to obtain the same information on 
higher order moments by simulation with the SSA, a 
large number of simulation runs would have to be per- 
formed; in our experience even for relatively simple sys- 
tems this is of the order of > 10^ — 10^. We have already 
seen from the examples described above that the num- 
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TABLE II. Number of potentially nonzero central moment 
terms to include in moment approximation for different num- 
bers of variables (columns) and number of central moments 
(rows) . 



N variables 





2 
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2m 
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10 


15 
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28 


36 


45 
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156 
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4m 
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31 
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6m 


25 


80 
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456 


917 
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2994 


4995 



ber of simulations necessary to obtain a smooth estimate 
of the higher order central moments increases with each 
order. For example, the third central moment in the 
dimerisation system calculated from 100,000 simulations 
still displays fluctuations that can only be expected to 
smooth out when more simulations were performed. 



IV. CONCLUSION 

In this paper we have described a general moment ex- 
pansion method for approximating the time evolution of 
stochastic kinetic systems. We have shown that taking 
into account more moments improves the estimate of the 
mean and the higher order moments. In case the deter- 
ministic approach delivers an accurate estimation of the 
mean, expanding the higher order moments still gives ad- 



ditional information on the variance, skewness and kurto- 
sis of the distribution of the variables. Even for very sim- 
ple systems, e.g. the dimerisation system, we find that 
higher moments obtained from averaging over 100,000 
SSA runs are still fluctuating noticeably. 

Instead of performing large numbers of SSA simula- 
tions, the time evolution of higher order moments may 
be obtained much more efficiently from moment approxi- 
mation methods. Unfortunately it is not possible to pre- 
dict a priori how many moments are required to fully 
capture the output distributions of stochastic processes. 
And while our method can be used to evaluate arbitrary 
higher moments, the number of equations that need to 
be solved increases rapidly with the number of molecular 
species and the number of moments required. This comes 
with increased computational cost for calculating the ex- 
pressions as well as for integrating the equations. There- 
fore it is desirable to identify whether more moments 
are needed to model the system, and we have provided 
a simple heuristic to achieve this in many cases. While 
this can indicate if more than two moments are needed, 
it will not clearly identify when sufficiently many central 
moments have been included in the expansion. This is a 
point for further future investigation. In cases where the 
distribution is known, but different from normal, knowl- 
edge about the distribution can be used to close the set 
of equations correspondingly. The estimated higher order 
moments may also by themselves indicate which distri- 
bution is a probable candidate for the output of a given 
dynamical stochastic system; we can, for example, com- 
pare obtained estimates of such moments against known 
values for the higher order moments for particular dis- 
tributions: if e.g. the estimated skewness and kurtosis 
are close to zero and three, respectively, a normal dis- 
tribution might be a good choice and an approximation 
similar to the simple linear noise approximation may be 
appropriate. 

When the model under investigation is non-linear, 
lower order moments typically depend on higher order 
moments, and the system of equations is not closed. Clo- 
sure can be achieved in several different ways. Here we 
have closed the system by truncating the Taylor series 
expansion at a selected order. Several other approaches 
have been previously evaluated. Chevalier et aP*, for ex- 
ample, approximated higher order moments using exper- 
imental data. Azunre et aP^ showed that for very small 
molecule numbers, using only two moments can lead to 
unstable results. Singh et al.^^ developed a derivative- 
matching approach in the context of polynomial rate 
equations, which proved to work very well in particular 
for small molecule numbers. For the examples described 
in this paper, the proposed method shows differences in 
behaviour for very low molecule numbers. More specif- 
ically, for the dimerisation system, the mean trajecto- 
ries become unstable below approximately 7 molecules 
in the system; for the Michaelis-Menten example, the 
system does not become unstable, but becomes less ac- 
curate for very small molecule numbers. The p53 system 
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trajectories for the system size as displayed in Fig. 3 
start running out of phase around < = 35, and when the 
molecule number decreases, the point where the trajec- 
tories run out of phase moves forward, followed by in- 
stability in the trajectories. The LNA solution for the 
p53 system does not become unstable for small molecule 
numbers but keeps showing the expanding oscillatory be- 
haviour. For other systems that show periodic behaviour 
it might be more beneficial to approximate the system in 
the frequency domain. We would recommend use of this 
method — certainly if no manual closure is attempted, 
but probably even then — for moderate and large num- 
bers of molecules; a conservative estimate might be to 
have at least 10-20 molecules. However, for molecular 
abundances of less than 10 molecules we find that the 
numerical burden of the SSA compared to the MFK ap- 
proaches is no longer prohibitive, and, again conserva- 
tively, we would consider the use of the SSA. 

The present work also provides a natural framework for 
an inferential framework: our results above suggest that 
inclusion of higher order moments will lead to increased 
accuracy of the parameter estimates. Milner et al.^° de- 
rived a likelihood function that included the mean and 
variance obtained from a moment closure method, as- 
suming a multivariate Gaussian distribution. Kuegler**^ 
used both the mean and the variance obtained from a mo- 
ment closure method for parameter estimation by mini- 
mizing an objective function that included both the dif- 
ference between the observed and estimated mc^an and 
the difi'erence between the observed and estimated vari- 
ance. This showed that more accurate parameter esti- 
mates could be obtained when the observed variance is 
included for parameter estimation. Other analyses have 
employed approximate Bayesian computation schemes^^, 
and used moment-based inference employing the mean 
and variance to infer parameters of a bimodal system^^. 
Through recent technological advances in the field of sin- 
gle cell observations^'^"^^, it becomes possible to probe 
directly the properties of the output distributions of 
stochastic dynamical systems over time. The additional 
information about the higher order central moments that 
can be derived from these datasets can be exploited when 
higher order empirical moments arc also used for param- 
eter inference. However, while likelihoods are trivially 
constructed when the mean behaviour and variance es- 
timates are available (via Gaussian assumptions), condi- 
tioning on higher-order moments typically requires some 
further assumptions; most easily these moments are in- 
cluded in maximum-entropy estimators of the probabil- 
ity distribution. The present approach yields these mo- 
ments, however, reliably and aS'ordably. Because of their 
affordability these moments also open up new ways for 
assessing the sensitivity of stochastic dynamical systems 
(as outlined in section HID), including cases where the 
linear noise approximation tends to break down^"^. This 
includes general feedback systems where the notion of 
sensitivity may be particularly useful but calculation for 
stochastic systems is fraught with problems and numer- 



ically expensive. In conclusion, the general moment ex- 
pansion method described in this paper provides a flexi- 
ble and valuable new tool for investigating many stochas- 
tic kinetic systems. 

Appendix A: Model equations 

In this section we display the model equations used 
for the dimerisation and Michaelis-Menten system. The 
complexity of the equations grows with the number of 
moments included, wc display here the equations used 
for the mean, variance and co-variance when truncating 
after second order. 

1. Dimerisation 

For the dimerisation system, the equations used for the 
mean, variance and covariance are given by 

Hxi = 2k2X2 — 2A:icr^2 — 2fcixi(xi — 1) 
Hx2 = - k2X2 + kixi{xi - 1) 

0-^2 = kiyl - kixi + k2X2 + cla^2 - 2A:2cr^2 - 2fci(T^^ ^.^ 
+ 4:kixial^^^^ 

<^lux2 = 2A;io-^,,^, + 2fc2CT^2 - fciCT^2 - k2(Tl^^^, - 2kixl 

- 2kiXi + 2^2X2 -I- 2fcicr^2 - AkiXial^ ,^^ + 2fcia;icr^2 

cr^2 = 4fcicr^2 + 4A;2(J^^ + 4kixl - AkiXi + Ak2X2+ 
4A;ia-^2 - 8fcia;icr^2 

2. Michaelis-Menten System 

For the Michaelis-Menten system, the equations used 
for the mean, variance and covariance are given by 

l^xi = -k2{xi +X2- 301) - kiul^^^^- 

A;i<T^2 - kiXi{xi +X2- 181) 
Mx2 = -csixi + X2- 301) 

= -2ci{al. + al^^^J - 03(0:1 + X2 - 301) 

<^1^,X2 = 181fcicr^i,x2 - ^2cr^2 - kicrl^^^^ - csal^^^^ 

- C3fT^2 - A;ia;io-^2 - 2kiXial^^^^ - kxX2(jl^^^^ 
(T^2 = -(ISlfciari - 301^2 -l- ^22:1 -|- A;2a;2 

- fci'^x„x2 - kicrl. - k,xl - 362fci<7^2 + 
2fc20-^i,x2 + 2^217^2 - kiXiX2 + 2kiXial^^^^ 
+ 4:kixial2 +2kiX2CTl2) 
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